A Methodology for Profiling and Partitioning Stream Programs on Many-core Architectures
نویسندگان
چکیده
Maximizing the data throughput is a very common implementation objective for several streaming applications. Such task is particularly challenging for implementations based on many-core and multi-core target platforms because, in general, it implies tackling several NPcomplete combinatorial problems. Moreover, an efficient design space exploration requires an accurate evaluation on the basis of dataflow program execution profiling. The focus of the paper is on the methodology challenges for obtaining accurate profiling measures. Experimental results validate a many-core platform built by an array of Transport Triggered Architecture processors for exploring the partitioning search space based on the execution trace analysis.
منابع مشابه
Execution Trace Graph Based Multi-criteria Partitioning of Stream Programs
One of the problems proven to be NP-hard in the field of many-core architectures is the partitioning of stream programs. In order to maximize the execution parallelism and obtain the maximal data throughput for a streaming application it is essential to find an appropriate actors assignment. The paper proposes a novel approach for finding a close-to-optimal partitioning configuration which is b...
متن کاملMachine learning based mapping of data and streaming parallelism to multi-cores
Multi-core processors are now ubiquitous and are widely seen as the most viable means of delivering performance with increasing transistor densities. However, this potential can only be realised if the application programs are suitably parallel. Applications can either be written in parallel from scratch or converted from existing sequential programs. Regardless of how applications are parallel...
متن کاملRA-LPEL: a Resource-Aware Light-weight Parallel Execution Layer for reactive stream processing networks on the SCC many-core tiled architecture
In computing the available computing power has continuously fallen short of the demanded computing performance. As a consequence, performance improvement has been the main focus of processor design. However, due to the phenomenon called “Power Wall” it has become infeasible to build faster processors by just increasing the processor’s clock speed. One of the resulting trends in hardware design ...
متن کاملUltra-Low-Energy DSP Processor Design for Many-Core Parallel Applications
Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...
متن کاملA Profiling Framework for Design Space Exploration in Heterogeneous System Context
Design of embedded systems is subject to different types of design constraints such as execution cycles, power consumption, and memory consumption/bandwidth. At the same time, modern computing systems make increasing use of reconfigurable and heterogeneous architectures. The increasing heterogeneous nature of embedded system platform and the application makes the design of embedded system very ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015